Orthogonal Negation in Vector Spaces for Modelling Word-Meanings and Document Retrieval

Author

  • Dominic Widdows
Abstract

Standard IR systems can process queries such as “web NOT internet”, enabling users who are interested in arachnids to avoid documents about computing. The documents retrieved for such a query should be irrelevant to the negated query term. Most systems implement this by reprocessing results after retrieval to remove documents containing the unwanted string of letters. This paper describes and evaluates a theoretically motivated method for removing unwanted meanings directly from the original query in vector models, with the same vector negation operator as used in quantum logic. Irrelevance in vector spaces is modelled using orthogonality, so query vectors are made orthogonal to the negated term or terms. As well as removing unwanted terms, this form of vector negation reduces the occurrence of synonyms and neighbours of the negated terms by as much as 76% compared with standard Boolean methods. By altering the query vector itself, vector negation removes not only unwanted strings but unwanted meanings.
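The orthogonality idea in the abstract can be illustrated with a small sketch. The abstract does not give a formula, so the code below assumes the standard projection construction: to make a query vector a orthogonal to a negated term vector b, subtract from a its projection onto b, giving a - ((a.b) / (b.b)) b. The function names (`dot`, `negate`) and the three-dimensional toy vectors are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def negate(a, b):
    """Return the part of a that is orthogonal to b ("a NOT b").

    Assumption: negation is modelled as subtracting the projection
    of a onto b, which is the usual way to enforce orthogonality.
    """
    bb = dot(b, b)
    if bb == 0:
        # A zero vector carries no direction to remove.
        return list(a)
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


# Hypothetical toy term vectors, not data from the paper.
web = [1.0, 1.0, 0.0]
internet = [0.0, 1.0, 0.0]

query = negate(web, internet)

# The negated query has no component along the unwanted term.
print(query)
print(dot(query, internet))
```

Because the query vector itself changes, any document or term vector that lies close to the negated term also loses similarity to the query, which is the effect the abstract describes for synonyms and neighbours of the negated term.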

Similar resources

Quantum Logic of Word Meanings: Concept Lattices in Vector Space Models

This paper systematically develops the logical and algebraic possibilities inherent in vector space models for language, considerably beyond those which are customarily used in semantic applications such as information retrieval and word sense disambiguation. The cornerstone of the approach lies in a simple implementation of the connectives of quantum logic as introduced by Birkhoff and von Neu...

Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction

The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in short-term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...

Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback

Keyword spotting is a well-known method in document image retrieval, in which search over document images is based on a query word image. This paper proposes an approach to document image retrieval based on keyword spotting. The proposed method presents a framework that uses relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

Publication date: 2003